
Variable-Impedance Muscle Coordination under Slow-Rate Control Frequencies and Limited Observation Conditions Evaluated through Legged Locomotion

Asai, Hidaka, Noda, Tomoyuki, Morimoto, Jun

arXiv.org Artificial Intelligence

Human motor control remains agile and robust despite limited sensory feedback, a property attributed to the body's ability to perform morphological computation through muscle coordination with variable impedance. However, it remains unclear how such low-level mechanical computation reduces the control requirements of the high-level controller. In this study, we implement a hierarchical controller consisting of a high-level neural network trained by reinforcement learning and a low-level variable-impedance muscle coordination model with mono- and biarticular muscles in a monoped locomotion task. We systematically restrict the high-level controller by varying the control frequency and by introducing biologically inspired observation conditions: delayed, partial, and substituted observation. Under these conditions, we evaluate how low-level variable-impedance muscle coordination contributes to the learning process of the high-level neural network. The results show that variable-impedance muscle coordination enables stable locomotion even under slow-rate control frequencies and limited observation conditions. These findings demonstrate that the morphological computation of muscle coordination effectively offloads high-frequency feedback from the high-level controller and provide a design principle for controllers in motor control.
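The split the abstract describes can be illustrated with a toy loop: a high-level command issued at a slow rate, with a fast low-level variable-impedance law filling in between updates. This is a minimal sketch with assumed names, gains, and unit-inertia dynamics, not the authors' implementation.

```python
# Minimal sketch (assumed dynamics, not the paper's implementation):
# the high-level controller updates the impedance command only every
# `control_every` physics steps; the low-level variable-impedance law
# computes torque at every step.

def muscle_torque(q, dq, q_eq, stiffness, damping):
    """Low-level variable-impedance law: spring-damper toward q_eq."""
    return stiffness * (q_eq - q) - damping * dq

def simulate(steps=3000, dt=0.001, control_every=100):
    q, dq = 0.5, 0.0              # joint angle and velocity
    q_eq, k, d = 0.0, 50.0, 2.0   # high-level command (setpoint, gains)
    for t in range(steps):
        if t % control_every == 0:
            # Slow-rate high-level update; in the paper this would be an
            # RL policy output rather than a fixed setpoint.
            q_eq, k, d = 0.0, 50.0, 2.0
        tau = muscle_torque(q, dq, q_eq, k, d)
        dq += tau * dt            # unit inertia, semi-implicit Euler
        q += dq * dt
    return q
```

Even though the "high-level" command arrives two orders of magnitude more slowly than the physics step, the spring-damper law keeps the joint stable between updates, which is the offloading effect the abstract refers to.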


MS-PPO: Morphological-Symmetry-Equivariant Policy for Legged Robot Locomotion

Wei, Sizhe, Chen, Xulin, Xie, Fengze, Katz, Garrett Ethan, Gan, Zhenyu, Gan, Lu

arXiv.org Artificial Intelligence

Reinforcement learning has recently enabled impressive locomotion capabilities on legged robots; however, most policy architectures remain morphology- and symmetry-agnostic, leading to inefficient training and limited generalization. This work introduces MS-PPO, a morphological-symmetry-equivariant policy learning framework that encodes robot kinematic structure and morphological symmetries directly into the policy network. We construct a morphology-informed graph neural architecture that is provably equivariant with respect to the robot's morphological symmetry group actions, ensuring consistent policy responses under symmetric states while maintaining invariance in value estimation. This design eliminates the need for tedious reward shaping or costly data augmentation, which are typically required to enforce symmetry. We evaluate MS-PPO in simulation on Unitree Go2 and Xiaomi CyberDog2 robots across diverse locomotion tasks, including trotting, pronking, slope walking, and bipedal turning, and further deploy the learned policies on hardware. Extensive experiments show that MS-PPO achieves superior training stability, symmetry generalization ability, and sample efficiency in challenging locomotion tasks, compared to state-of-the-art baselines. These findings demonstrate that embedding both kinematic structure and morphological symmetry into policy learning provides a powerful inductive bias for legged robot locomotion control. Our code will be made publicly available at https://lunarlab-gatech.github.io/MS-PPO/.
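The equivariance property the abstract claims can be demonstrated on the simplest morphological symmetry, a leg permutation: a weight-shared per-leg network commutes with permuting the legs. This sketch uses assumed shapes and names and is not the MS-PPO architecture, only an illustration of the property.

```python
import numpy as np

# Illustrative sketch (assumed shapes, not the MS-PPO architecture):
# a weight-shared per-leg network is equivariant to leg permutations,
# the simplest instance of a morphological symmetry group action.

rng = np.random.default_rng(0)
W = rng.standard_normal((3, 4))  # shared map: 4 obs dims -> 3 joint actions

def policy(obs_per_leg):
    """Apply the same map to every leg's observation."""
    return np.array([np.tanh(W @ o) for o in obs_per_leg])

obs = rng.standard_normal((4, 4))  # 4 legs, 4 obs dims each
perm = [2, 3, 0, 1]                # e.g., swap front and rear leg pairs

# Equivariance: permuting legs in the input permutes actions identically.
assert np.allclose(policy(obs[perm]), policy(obs)[perm])
```

Because the property holds by construction, no symmetry-enforcing reward terms or augmented data are needed, which is the inductive-bias argument the abstract makes.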



51200d29d1fc15f5a71c1dab4bb54f7c-AuthorFeedback.pdf

Neural Information Processing Systems

We would like to thank our reviewers for their thoughtful comments and feedback. However, to preserve anonymity, we cannot share the link to the repository. Our most challenging tasks are locomotion tasks, which are not well suited to human demonstrations. But we believe this is an important direction for research as well. We will add this rationale to the paper.


Appendix Meta-Learning with Self-Improving Momentum Target A Overview of terminologies used in the paper

Neural Information Processing Systems

The meta-learner network learns to generalize to new tasks. A dataset sampled from a given task distribution is used for adaptation. ANIL [36] adapts only the last linear layer of the meta-model. The aim of metric-based meta-learning is to run a non-parametric classifier on top of the meta-model's embedding space. Require: distribution over tasks p(·), adaptation subroutine Adapt(·), momentum coefficient, weight hyperparameter, task batch size N, number of rollouts per task K, and learning rate. Here, we describe the detailed objective of the meta-RL used in the experiments.
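The algorithm skeleton listed in the requirements (task distribution, Adapt subroutine, momentum coefficient, weight hyperparameter, batch size, learning rate) can be sketched on a toy problem. Everything here is an illustrative stand-in: 1-D quadratic tasks, first-order meta-gradients, and assumed symbol names (tau, lam, alpha), not the paper's objective.

```python
import numpy as np

# Hedged sketch of the listed skeleton on 1-D quadratic toy tasks.
# tau = momentum coefficient, lam = weight hyperparameter, N = task
# batch size, alpha = learning rate; all names are assumptions.

def adapt(theta, task, inner_lr=0.1, inner_steps=5):
    """Inner-loop adaptation: gradient steps on one task's loss (theta - task)^2."""
    for _ in range(inner_steps):
        theta = theta - inner_lr * 2 * (theta - task)
    return theta

def meta_train(p_tasks, theta, tau=0.99, lam=0.1, N=8, alpha=0.05, iters=200):
    target = theta.copy()
    rng = np.random.default_rng(0)
    for _ in range(iters):
        batch = rng.choice(p_tasks, size=N)        # sample tasks from p
        grad = np.zeros_like(theta)
        for t in batch:
            adapted = adapt(theta, t)
            # First-order meta-gradient of the post-adaptation loss,
            # plus a pull toward the momentum target.
            grad += 2 * (adapted - t) + lam * (theta - target)
        theta = theta - alpha * grad / N
        target = tau * target + (1 - tau) * theta  # momentum target update
    return theta
```

The momentum target is a slow exponential moving average of the meta-parameters; the weight hyperparameter controls how strongly the meta-update is anchored to it.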



Evolving Connectivity for Recurrent Spiking Neural Networks

Wang, Guan, Sun, Yuhao

Neural Information Processing Systems

Recurrent spiking neural networks (RSNNs) hold great potential for advancing artificial general intelligence, as they draw inspiration from the biological nervous system and show promise in modeling complex dynamics.